ASYMP: Fault-tolerant Mining of Massive Graphs

Authors

  • Eduardo Fleury
  • Silvio Lattanzi
  • Vahab S. Mirrokni
  • Bryan Perozzi
Abstract

We present ASYMP, a distributed graph processing system developed for the timely analysis of graphs with trillions of edges. ASYMP has several distinguishing features including a robust fault tolerance mechanism, a lockless architecture which scales seamlessly to thousands of machines, and efficient data access patterns to reduce per-machine overhead. ASYMP is used to analyze the largest graphs at Google, and the graphs we consider in our empirical evaluation here are, to the best of our knowledge, the largest considered in the literature. Our experimental results show that compared to previous graph processing frameworks at Google, ASYMP can scale to larger graphs, operate on more crowded clusters, and complete real-world graph mining analytic tasks faster. First, we evaluate the speed of ASYMP, where we show that across a diverse selection of graphs, it runs Connected Component 3-50x faster than state of the art implementations in MapReduce and Pregel. Then we demonstrate the scalability and parallelism of this framework: first by showing that the running time increases linearly by increasing the size of the graphs (without changing the number of machines), and then by showing the gains in running time while increasing the number of machines. Finally, we demonstrate the fault-tolerance properties for the framework, showing that inducing 50% of our machines to fail increases the running time by only 41%.

Similar resources

Large Scale Graph Mining With MapReduce: Diameter Estimation and Eccentricity Plots of Massive Graphs with Mining Applications

In recent years, a considerable amount of research has focused on the study of graph structures arising from technological, biological and sociological systems. Graphs are the tool of choice in modeling such systems since they are typically described as sets of pairwise interactions. Important examples of such datasets are the Internet, the Web, social networks, and large-scale information netw...

Fault Tolerant DNA Computing Based on Digital Microfluidic Biochips

Historically, DNA molecules have been known as the building blocks of life. Later, in 1994, Leonard Adleman introduced a technique to utilize DNA molecules for a new kind of computation. Owing to its massive parallelism, huge storage capacity and the ability to use DNA molecules inside living tissue, this type of computation is applied in many application areas such as me...

An Efficient Approach for Mining Fault-Tolerant Frequent Patterns Based on Bit Vector Representations

In this paper, an algorithm called VB-FT-Mine (Vectors-Based Fault-Tolerant frequent patterns Mining) is proposed for mining fault-tolerant frequent patterns efficiently. In this approach, fault-tolerant appearing vectors are designed to represent how the candidate patterns are distributed in data sets under fault tolerance. The VB-FT-Mine algorithm applies depth-first pattern growing ...

False Alarms in Fault-tolerant Dominating Sets in Graphs

We develop the problem of fault-tolerant dominating sets (liar's dominating sets) in graphs. Namely, we consider a new kind of fault: a false alarm. Characterizations of such fault-tolerant dominating sets in three different cases (depending on the classification of the types of faults) are presented.

An Efficient Approach for Mining Top-K Fault-Tolerant Repeating Patterns

In this paper, an efficient strategy for mining top-K non-trivial fault-tolerant repeating patterns (FT-RPs for short) with lengths no less than min_len from data sequences is provided. By extending the idea of appearing bit sequences, fault-tolerant appearing bit sequences are defined to represent the locations where candidate patterns appear in a data sequence with insertion/deletion errors bei...

Journal:
  • CoRR

Volume abs/1712.09731

Publication date: 2017